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merical solution of continuously controlled HJB equations and HJB obstacle 
problems. Our results include estimates of the penalisation error for a class of 
penalty terms, and we show that variations of Newton's method can be used to 
obtain globally convergent iterative solvers for the penalised equations. Fur- 
thermore, we discuss under what conditions local quadratic convergence of the 
iterative solvers can be expected. We include numerical results demonstrating 
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1. Introduction 

Problems of optimal stochastic control arise numerously in the mathematical anal- 
ysis of real- world phenomena (cf. [241 136j ). and, applying Bellman's principle of op- 
timality, they can often be reformulated as Hamilton- Jacobi-Bellman (HJB) equa- 
tions (cf. [U). Numerical approaches for solving HJB equations can roughly be split 
into two fields: Markov chain approximations (e.g. cf. [551 [221 HI] ) a-nd finite differ- 
ence methods (see pQ [3J [5] and references therein) . Even though both approaches 
can be found to be closely related (e.g. cf. [SZ]), there is a rather clear conceptual 
difference: Markov chain approximations make explicit use of the underlying sto- 
chastic model and solve for transition densities, while finite difference methods deal 
solely with the HJB equation and solve for the unique viscosity solution directly 

id.m)- 

We begin by introducing the two kinds of non-linear equations we aim to solve 
numerically in this paper. Initially, the problem formulations are slightly informal 
and primarily motivational, and any equation of the described type permitting a 
discretisation with certain properties fits into our framework. 

Problem 1.1. Let U C M 6e a compac|^ interval. Let Q C M" (with n £ N) be an 
open set, and let , m G U, be a given family of affine differential operators on 
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n. Find a function : i7 — >■ M such that 

(1.1) inf {CuV} = 0. 

Problem 1.2. In the situation of Problem \l.l\ let C be an additional affine differ- 
ential operator. Find a function W : M. such that 

(1.2) inf I supiCuW}, CW] = 0. 



Equation (1.1 1 is a standard HJB equation as arising from many problems of sto- 
chastic optimal control (cf. [HI [53] ) ; examples in which the control set U is truly 
infinite and does not reduce to a finite set include problems of indifference pric- 
ing of financial derivatives (cf. |1] and references therein) and gas storage valuation 



(cf. [71[S]). The second formulation, Problem 1.2 is an obstacle problem (cf. [3]) in- 
volving an HJB equation, and is a special case of an Isaacs' equation (cf. [ITIIT^ITB] ): 
considering an example from |31j , we will later see that the computation of an early 
exercise indifference price in an incomplete market model leads to an equation of 



the form ( 1.2 1, a context in which the similarity to an obstacle problem can clearly 



be traced back to the American exercise feature (cf. [H]). 



Evidently, if we replace (1.1) by 



(1.3) £V + inf {CuV} = 0, 



we can easily recover formulation (1.1 ) by moving CV inside the "inf" and defining 
Cu ■— Cu+C, u £ U; in some applications, the operator CV can be found to simply 
be the time derivative, i.e. CV — dV/dt, whereas in other cases it may constitute 
the main part of a considered equation. 

In this paper, we are concerned with the finite difference approximation of equations 



(1.1) and (1.2). Based on the - most helpful - results in pjl3ji2j, it can be shown, 
for most HJB equations, that a monotone, stable, and consistent finite difference 
approximation converges to the unique viscosity solution, which is the correct notion 
of solution since it usually corresponds to the stochastic model. However, even 
if convergence can be guaranteed, one still has to solve the resulting non-linear 
discrete systems. The only exception are fully explicit time-stepping schemes, which 
generally suffer from undesirable stability constraints. 

We assume a rather general class of discretisations, one that is likely to arise when 
applying the convergence results cited above, and we study how the discrete systems 
can be solved using penalty methods. 

Existing work on the topic includes [HI [321 [IH 1301 [ISj , where a policy iteration 
algorithm is used, and f40!, where a penalty approximation for HJB equations is 
introduced. For the solution of the penalised equations, we consider Newton-like 
iterative solvers, many properties of which we prove based on results in [321134) . See 
also [221 EOl [21] regarding the use of semi-smooth Newton methods in the context of 
variational inequalities and their interpretation as primal-dual active set strategies, 
which in turn are closely related to the method of policy iteration. 

We extend the previous results in two ways. First, we consider a compact set of 
controls; until now, except for [3D] , all algorithms were based on the assumption of 
a finite control set. Second, we also solve an HJB obstacle problem, which arises for 
example in mathematical finance when pricing early exercise claims in incomplete 
markets, e.g. in [31 ; to the best of our knowledge, the only existing algorithm 
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capable of handling this problem is the Ho-3 algorithm introduced in [3D], which 
can be considered as a nested policy iteration. The discussed techniques have many 
applications in mathematical finance. Besides the applications already considered 
in [121 mi SO] I which include uncertain volatility models, transaction cost models, 
and unequal borrowing/lending rates and stock borrowing fees, our methods are 
applicable to many problems from portfolio optimisation, including indifference 
pricing |42[ [4] and indifference pricing with early exercise features [31j . 



Structure of this Paper. We aim to devise numerical schemes for the solution of 
HJB equations and HJB obstacle problems. For a general overview of our numerical 
approach, we refer to Figure 1 in |40| . where a similar conceptual structure is 
depicted. 



Section^ We relate the two non-linear equations of Problems 1 1 . 1 1 and 1.2 each to 



a non-linear discrete system; the latter are chosen such that they are monotone 
and correspond to what naturally arises when applying a fully implicit or weighted 
time stepping discretisation to the equations. Since implicit schemes are usually 
unconditionally stable and finite difference schemes are naturally consistent, this 
setup allows to conclude convergence to the unique viscosity solution whenever a 
strong comparison principle holds (cf. [Il[3l[2]). We proof existence and uniqueness 
of solutions to the discrete problems and present some further properties. 

Section We present a penalised problem which approximates the discrete HJB 
equation to an accuracy of 0(l/p), where p > is the penalty parameter. A 
similar approach was used in [T3] for American options, and was extended to HJB 
equations in |40j . The novelty of the approach described in this paper is that we do 
not penalise for every control individually, as was done in [40] , but only penalise the 
maximum violation; this significantly simplifies the penalty formulation and allows 
for compact control sets. 

Section\^ We study the iterative solution of the penalised HJB equation presented 
in Section [3j We introduce two globally convergent Newton-type solvers, and we 
show that, when using a smooth penalty term, the classical Newton scheme can be 
expected to have local quadratic convergence. 

Sections anc?[^ In these two sections, we extend the techniques from Sections [3] 
and [4] to show that the use of penalisation (Section [s]) with subsequent non-linear 
iteration (Section [6| is also a powerful strategy for the solution of the obstacle 
version of a continuously controlled HJB equation. In particular, in Section [6j we 
present two Newton-type iterative methods, one that converges globally and one 
that converges locally quadratically. 

Section\^ The numerical strategies developed in Sections |3}|6] are designed for HJB 
equations and HJB obstacle problems with a compact set of controls and continuous 
dependence of the differential operators on the controls. For some equations, it may 
be convenient to approximate the dependence on the controls, e.g. by piecewise 
linearisation; in this section, we prove that the previously introduced algorithms 
are stable with respect to such perturbations. 

Section We conclude by presenting numerical results, solving an HJB equation 
taken from [32] and an HJB obstacle problem found in [3T|; in both cases, we 
compare our approach to the method of policy iteration (cf. [T21 130| ). 
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2. Problem Formulation 

Following [10], we introduce Zn , G N, to be the set of all real square matrices 
of dimension N whose off-diagonal entries are all non-positive, i.e. 

Zn := {A = (a,,) e M^><^ : a.. < 0, r ^ s}, 

and define 

Kn ■■= {Ae Zn -.3 X eR^, x> 0, s.t. Ax > 0} 

to be the set of M-matrices in M^; it can be found in [10] that A^^ > for all 
matrices A E K^- Furthermore, we introduce the set 

r^s 

which has the following properties. 

• If, for 1 < i < A^, we replace the i-th row of A G K'^ by the i-th row of 
some B G K'^ , the resulting matrix will still be in K'^ . 

• A + B e for two matrices A, B G K°j^ . 

Throughout this paper, the set K"^ will be one of our main building blocks since it 
allows us to deduce the M-matrix property of a matrix that was constructed from 
a number of other matrices. 

Remark 2.1. Let iV G N and define A/" := {1, . . . , A^}. Consider two vectors x, 
y G and a matrix A G M^^^. For any i G Af, we denote by {x)i and {A)i the i- 
th coordinate and the i-th row of vector x and matrix A, respectively. When writing 
X > 0, we generally mean that {x)i > for all i G M, and we take z := min{x, y} to 
be the vector satisfying {z)i — min{{x)i , {y)i} for all i G A/". The definitions extend 
trivially to other relational operators and to the maximum of two vectors. 

Making use of the definition of K"^ , we assume, for now, that we can find sensible 
discretisations of Problems |1.1| and |1.2| of the following form; we will give a more 
rigorous justification of this assumption later on. 

Problem 2.2. Let 

(2.1) 2l:U^M^^^ :uh^v4„ 

(2.2) and » ; U ^ : m fe„ 

be continuous functions, with A^ G K'^ , u E \J . Find x G such that 

(2.3) min{A„ a; - 6„} = 0. 

The continuity requirement of 21 and *B above is to be understood in the sense of 
any vector norm, say |1 • II2 or || • ||oo , in M.^^^ and M^. For example, for N = 2, 
the maps 21 and *S will be of the form 

and 3iiu) 



f2{u) fiiu) J \ 52 (m) 

respectively, with /i , /2 , /a , A , 5i , 52 e ^(U, M). 



2.2 



let also Ae K"^ and b e . Find 



Problem 2.3. In the setting of Problem 
zeR^ such that 

(2.4) min | max{A„ z - 6„}, Az -b\ = 0. 

We will, without always mentioning so explicitly, make frequent use of the fact that 
every function which is continuous on a compact interval attains its minimum and 
maximum on the interval. 
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Theorem 2.4. The discrete Problems 2.2 and 2.3 are both uniquely solvable. 



Proof. Corollary |3.6| and Theorem 4.1 give the existence of a solution to Problem 



2.2 and, similarly, Corollary 5.5 and Theorem 6.1 (or, alternatively, Ho-3 in [30] ) 
give the existence of a solution to Problem |2.3[ Hence, it remains to show the 
uniqueness of the solutions. To this end, suppose we have two solutions xi and X2 
to Problem 2^ For every i e A/", there exists a £ U such that 

{Aui)iX2 - (6«Ji = 0, 

and we also have 

{Aui)iXi - {bu,)i > 0. 

Denote by A* E M.^^^ the matrix consisting of the rows (^uji , i G Af. We have 
A* e K'^ and A*{xi — X2) > 0, from which we get xi — X2 > since {A*)~^ > 0. 
Conversely, using the same arguments but swapping xi and X2 , we can also get 
X2 — xi > 0, which then proves the uniqueness of a solution to Problem 2.2 Now, 
suppose we have two solutions zi and Z2 to Problem 
be such that 



2.3 



For i G TV, let u} , uf € \J 



(2.5) 



max(Atj Zj 



bu)i — {A^j Zj 



je{l,2}. 



From (2.4), we then get that, for i G Af, we have 
(2.6) min{(A„i zi - 5„i )i , {Azi - 6) J - min{(A„2 Z2 



(Az2 - 6),} = 0. 



We distinguish the following possible cases in (2.6). 

(i) We have {Azi - b)i - {Az2 - b)i = 0. 

(ii) We have {Azi — b)i — {A^2 Z2 — b^2)i — 0, in which case [A^i zi — A^i Z2)i > 
{Azi - b)i - {A^2 Z2 ~ b^2)i = 0. 

(iii) We have {A^i z\ — b^i)i — {Az2 — b)i = 0, in which case {Azi — Az2)i > 
(A„i zi - 6„i)i - {Az2 - b)i 0. 

(iv) We have {A^i z\ — b^i)i — {A^2 Z2 ~ b.^2)i = 0, in which case {A^i zi — 
Au] Z2)i > [A^i zi ~ 6„i), - (A„2 Z2 - b^2), = 0. 

At this point, we can show the uniqueness of a solution to Problem |2.3| by arguing 
analogously to the first half of this proof (where we introduced the matrix A* and 
used its M-matrix properties). □ 



Next, we show that the discretisation matrices appearing in (2.3) and (2.4) have 
some convenient boundedness properties. 

Lemma 2.5. For 21 as in Problem \2.S\ there exists a constant C > such that 

0<P„||oo,||A-l||oo<C, U&U. 

Proof. Since continuity of 2t implies directly the existence of a constant Ci > 
such that ||j4„||oo < Ci for all u € U, it only remains to prove the corresponding 
estimate for the inverse matrices. An M-matrix is, in particular, non-singular with 
positive determinant (cf. [10]), and, hence, we have that u 1— >■ det(A„) > is a 
continuous and positive function on U. As every continuous function on a compact 
interval assumes its minimum, there exists an e > such that det(A„) > e for 
all M G U. Now, in [10] . it can be found that, for a square non-singular matrix 
A = {ors) G M^^^, it is = {ars) if we define 
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where Agr is the cofactor of a^j. in the matrix A. Since the calculation of the cofactor 
is, like the determinant, a continuous operation on the entries of the matrix, there 
exists a constant C2 > such that, for u € U and {Au)~^ = {{au)rs), it is 



\iau)rs\ 



{Au)sr ^ i-^u) 



c 

< — , l<r,s<N. 
e 



dct{A^ 

This proves the lemma. □ 



Corollary 2.6. Suppose that, in the situation of Problem 2.2 we define 
Z^:=UxUx...xUcM^ 



N -times 



and a" : ^ {m ,U2 , un) ^ , 

where Au-^u2...un denotes the matrix having as i-th row, i € Af, the i-th row of A^.. 
In this case, U is a compact set, is a continuous function, and Au-^^u2...um G K^^ 
for every {ui , U2 , ■ • ■ ,un) G Furthermore, there exists a constant C > such 
that 

< C, [ui ,U2,. ■ ■ ,Un) ^U. 

Proof. The proof is identical to the one of Lemma |2.5[ the only difference being 
that we have to deal with a continuous function U ^M. instead of U M. □ 

3. Penalising the Discrete HJB Equation 



We begin by studying Problem |2.2[ which we call the HJB equation; for nominal 
differentiation. Problem |2.3[ which we discuss subsequently, will be referred to 
as the HJB obstacle problem. The penalty approximation used in this paper is 
an extension of ideas used for discretely controlled HJB equations in [40 and for 
American options in |13j . The penalised equation is a non-linear equation itself and 
can be solved by an iterative scheme (cf. Section|4| . 

We introduce the penalty term 

(3.1) n:M^^M^ (7ri(yi), 712(2/2),..., 7r^(yiv))*''. 

Assumption 3.1. For i G A/", we take Hi G C(M) to be a non- decreasing function 
satisfying 7rj|(_oo,o] = and Tri\i^Q,oo) > 0. 

Problem 3.2. Let uq G U, p > 0. Find Xp e such that 

(3.2) [Auo Xp -buo) - pmaxn(&„ - AuXp) = 0. 



In (3.2), contrary to 40 , a penalty is applied only to the maximum violation of 



the constraints; since the solution to Problem 13.21 can be shown to be bounded 



independently of p (cf. Lemma 3.4), this guarantees all constraints to be satisfied as 
p — 7> 00 (cf. Corollary|3.6[). 



We point out that, in Problem |3.2[ uq does not require any further specification, 
and all the results of this section can be shown to hold for any choice of uq E U. 



In Section 8.1 we will study the practical effects of uq in a numerical example. 
Lemma 3.3. If there exists a solution to Problem \3.S\ then it is unique. 
Proof. Suppose we have two solutions Xpi and Xp2 . From (|3.2[), we get that 



(3.3) Ay,g{xpi - Xp2) - pmaxn(6„ - A^Xpi) + pmaxn(6„ - A„Xp2) = 0. 

Now, let i E Af, and define un , Ui2 £ U to be such that 

max7ri((6„ - yl„Xpj)i) = 7ri((6„ - yl„ Xpj)i), j E {1,2}. 
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For the second and third term in (3.3 1, we have 
(3.4) 



- pniaxTiMbu - AuXpi)i) + /9max7rj((6„ - AuXp2)i) 



A 



Ui2 Xp2)i) 



< - pTri{{bui2 - Aui2 Xpi)i) + pTri{{bui2 - ^Ui2 Xp2)i). 
At this point, the idea is that if there exists a matrix A* e K"^ satisfying 

A*{xpi-Xp2) > 0, 

then we can deduce {xpi — Xp2) > since (A*)^^ > 0, and we get uniqueness as in 
Theorem |2.4| We distinguish between the following two cases. 

• If ■ni[{hu,2 - ^Ui2 Xpi)i) > TTi[{bui2 - ^u,2 Xp2)i) > 0, then ([O]) is non- 



positive, which means the first term in ( |3.3| must be non-negative, and we 
can use {A*)i = (A^J^ . 
• If < TTi{{bu^^-Au,2 Xpi)i) < TTi{{bui2-Aui2 Xp2)i),then (6„,2-^«i2 ^p2)i > 
{bui2 ~ ^Ui2 Xpi)i , which is equivalent to (^^.^(xpi — Xp2))^ > 0, and we 



can set {A*)i = (A„,J, 



□ 



In Section [4j we will discuss several choices of the penalty function 11 that allow to 
conclude the existence of a solution to Problem |3.2| Here, we study the approxi- 
mation of Problem |2.2| by Problem |3.2[ 



Lemma 3.4. Suppose there exists a solution Xp to Problem 3.2 There exists a 
constant C > 0, independent of p and 11, such that ||xp||oo < C. 



Proof. We rewrite (3.2) to get 
(3.5) A 



Uq -^P 



pmaxll(bu - Au Xp) + bu„ . 



Hence, for every component i £ Af, we have that 

(Aug Xp)i = {bu„)i or 3 M e U s.t. (A„ Xp)^ < (6„)j . 

Therefore, we may, for i e Af, define Ui G U to be such that {Au^ Xp)i < (6uji is 
satisfied, and, introducing A* e K'^ to be the matrix having as i-th row the i-th 
row of Aui , i £ Af, and defining b* € correspondingly, we have 

A* Xp < b*. 



From Corollary 2.6 we then get Xp < {A*)^^b* < C for some constant C indepen- 
dent of p and H. To see that the negative part of Xp is bounded as well, we note 
that, from (3.5), we have 



Xp ^ (Auo) ^Uq 



in which (A^J^i > 0. 



□ 



Lemma 3.5. Suppose there exists a solution Xp to Problem 3.2 for every p > 0. 
There exists a constant C > 0, independent of p and H, such that 



(3.6) 



mm 

ixGU 



max{ A„ Xp - bu ,0} - n(6„ - AuXp) < C/ p. 



Proof. For u G U, we have 



pll{bu - Ai, Xp) < puiBxU^by - Ay Xp) 

A^iQ Xp ^Uq : 



and, applying Lemma 3.4 to the last expression, we get that 
(3.7) < H(6„ - A, Xp) < C/p 
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for some constant C > independent of p. Furthermore, for every i e M , we either 
have [Aua Xp — bu„)i = 0, in which case 



max{(A„g Xp - 6„Ji ,0} - 7ri((fe„o " AuoXp)i) = < C/p, 



or 3 u* S U such that 7r,i((6„. — 2;p)i) > 0, which, based on (3.7), gives 



max{(A„. Xp - bu*)i ,0} - 7ri((6„ 



□ 



For the canonical choice n(-) ~ max{-,0}, (3.6) reduces to 



(3.8) 



{AuXp - 6«}|L < C/p, 



stating that Xp satisfies Problem 2.2 to an order 0(l/p); if the set U is discrete. 



estimate (3.8 1 matches similar results in [T3] and 



Corollary 3.6. Suppose there exists a solution Xp to Problem 3.2 for every p > 0. 
As p — >■ oo, Xp converges to a limit x* which solves Problem 2.2. 



Proof. Since {xp)pyo is bounded (as seen in Lemma 3.4 1, it has a convergent sub- 
sequence, which we do not distinguish notationally; we denote the limit of this 
subsequence by x* . From Lemma 3.5 we may deduce that 

(3.9) Au X* -bu>0, ue U. 

Now, consider a component i G Af. Using again Lemma |3.5[ we know that, for 
every p > 0, there exists a Uip € U such that 

(3.10) < (Au,^ Xp - < C/p or < ^T^{{bu,^ - A^,^ Xp),) < C/p. 

Recalling that U is compact and {uip)pyo C U, we can then infer that there exists 
once more a subsequence of (a;p)p>o , again not notationally distinguished, and a 
M* e U such that Uip u* ; hence, recalling the continuity of 21 and S in the 
definition of Problem 2.2 it follows from (3.10) that 

(3.11) \{Au': 



b,,.* ] 



0. 



Altogether, combining (3.9) and (3.11), we see that x* solves Problem 2.2 Since 
Problem 2.2 has a unique solution (cf. Theorem 2.4), the just given result holds not 
only for subsequences of (xp)p>o , but the whole sequence converges. □ 

We have finally done enough preparatory work to show that - if we choose the 
penalty function 11 correctly - the solution to Problem |3.2| is indeed a good approx- 
imation of the solution to Problem |2.2[ in fact, we can obtain an approximation 
that is of first order in the penalty parameter. 

Theorem 3.7. Suppose there exists a solution Xp to Problem \3.S\ for every p > 0. 
Let X* be the solution to Problem \2.2\ If, in addition to Assumption \3.l\ we have 

'^i I (0,oo) 

G Ci((0,c5o)), ^ e M, and inf{^(2/) : y G (0,oo), i & M} > c„„„ > 0, 
then there exists a constant C > Q, independent of p and IT but depending on Cmin , 
such that 

- XpWoo < C/p. 

Proof. Without loss of generality, we may assume that Cmin < 1- Let i G J\f. We 
use u* as introduced in the proof of Corollary 3.6 which means, in particular, that 
(3.11 ) holds. Furthermore, for p > 0, we define u™'" G U to be such that 

^■mtn ^ fargmax„gu - Xp)^) if {A^^Xp - buo)i > 0, 

''^ \uo if {AuoXp - bua)i = 0, 
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which means, as seen in the proof of Lemma |3.5[ that 
1 



< 



max{(^„™i„ Xp - 6„™.,.)i ,0} - 7r,((6„™.„ - A^n^i^Xp)^) \ <Ci/p 



for some constant Ci > independent of p and 11. Applying ( 3.10| , ( |3.11| ) and the 



definition of u^"^"^, we now have that 

{xp - X*) < Xp - < Ci/p 

and 

(^•U*)* ~ ^p) ^ (^U* ^ ~ ~ (A miTi Xp — b 77iin)i 

Denoting by , A2 E K'^ the matrices having as i-th row, i e A/", the i-th rows 
of A rain and ^t,. , respectively, we get that 

xp~x*<\M^n:^ .*-.,<M^l):^, 



and, using Corollary |2.6[ we may infer that 

||a;* - XpWoo < C2/P 

for some constant C2 > independent of p and H. □ 
4. Solving the Penalised HJB Equation by Iteration 



In the previous section, we have seen how Problem |2.2| can be approximated by 
penalisation. We will now discuss iterative methods for the solution of the penalised 



problem, i.e. algorithms for the computation of y G satisfying ( |3.2[ ) 
Defining 



Giy) := {Auo y - bu„) - pmajin{bu - Auv) , y G M^, 

we need to solve G{y) = 0. For i E Af, y E M.^ , we define 

:= argmax7r,((6^, - A„ y),), 

and we assume that 7r.i|(o,oo) G C'^((0jOo)). For some of the following theorems, we 
require further that, for i, j E Af, 

("^•1) Ji^T. max7r,((5„ - A„y),)J = - ^n^-fe) 



dy 



holds whenever ^ is well defined, with u™™ E C(M^,U); the Jacobian of G is 



;n 

then given by 
(4.2) (JG(y)), = (^uo). - 

for i e A/", y e M^; if ^ does not exist at 0, we set ^(0) := limj^^o ^(y) = 0. 
For the numerical examples following later in this paper, (|4.1[) is generally satisfied. 



The next theorem states that a globally convergent iterative scheme for the solution 



of Problem 3.2 exists whenever 11 is a smooth and non-decreasing function. 
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Theorem 4.1. //, in addition to Assumption 3.1 for i e Af, tt^ e C"'^(M) and 
> 0, and (4.1 1 is satisfied, then Newton's method with line search as introduced 



by J.-S. Pang in |32) presents a globally convergent iterative scheme for the solution 
of Problem \3.S\ In particular, this means that there exists a solution to Problem 



Proof. We will show that conditions {a)-{d) in Theorem 4 in are satisfied. 
First of all, since G is continuously partially differentiable, it is difFerentiable as a 
function M.^ — >■ M.^ , which, in particular, guarantees i3-differentiability of G. In 
(a), we need to show that, for arbitrary S R^, the set {y € : \\G{y)\\oo < 
||G'(2/°)||oo} is bounded; since, for i e we have y)^ > {bug)i , and either 
(Ao y)t < + oi' i^u^i-Hy) y)i ^ (^u™'"(y))» , this can easily be inferred 

by applying Corollary |2.6| The partial derivatives of 11 are non-negative, which 
means Jaiv) & , U & , and, hence, (6) holds. Given the continuity of the 
partial derivatives of G, we can apply Lemma 1 and Theorem 2 in j32) to obtain 
(c). Finally, in (d), it is sufficient to show that, for any compact set S C M, there 
exists a constant c > such that 

(4.3) \\JGiy)v\\oo>c 

for ail y G B, V G K^, |lw||oo = 1- We prove (4.3) by contradiction, i.e. suppose 



there exist sequences (2/")n>i C B and {v")n>i C K^, ||w"||oo = 1 for all n > 1, 
such that lim„_j.oo || >/g(2/")w"||oo — 0. Based on the boundedness of the sequences 
{y^)n>i and (w")n>i , we infer the existence of subsequences - not notationally 
distinguished - converging to limits y* G B and v* S M^, respectively, where 
||w*||oo = 1, and we get lim„^oo |1 JG(y")^'"||oo = || II oo = 0. Now, from 

^ > 0, i e TV, we have that Jaiu*) S K% , and we note that JG{y*)v* = 
implies v* =0, which is a contradiction to ||w*||oo = 1- Hence, we may conclude 



that (4.3) must hold true, which completes the proof. □ 



Having seen that Newton's method with line search can lead to a globally convergent 
scheme, we next come to look at Newton's method in its classical form. 

Algorithm 4.2. (Newton's Method for the pen. HJB Equation) Let x° G be 
some starting value. Then, for known x"" , n > 0, find a;""'"^ such that 

(4.4) JG(a;")(x"+i - x") = -G(x"). 



Theorem 4.3. Suppose the conditions of Theorem 4^.1 are satisfied, and let Xp be 
the solution to Problem 3.2 There exists a neighbourhood B of Xp such that, for 



any starting value £ B, the Newton sequence (2^")n>o generated by Algorithm 
4-2 is well defined, remains in B and converges to Xp ; the rate of convergence is 



quadratic. 

Proof. If G has a strong and non-singular _F-derivative at Xp, then the result can 
be found in Theorem 3 in |32j . Given the continuity of the partial derivatives of 
G, we can apply Theorem 2 in to obtain the existence of a strong F-derivative 
of G, which coincides with Jq. Since ^ > 0, i £ Af, implies Jai^p) € K"^ , the 
strong _F-derivative of G at Xp is indeed non-singular. □ 



Now, if we use H(-) = max{-,0} (in which case, based on (4.2 1, Jq is still well 



defined), (4.4) simplifies, and we get the following special case of Algorithm 4.2 



Algorithm 4.4. (Newton-like Method for pen. HJB Eq.) For i e M and y e , 

, y^rnm(^y^ G V to be such that 



(4.5) u™" (y) = arg max max{ - A„ y), , 0} 

■ueu 
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and set A™^'^{jj) £ M^^^ and h"^™{y) E M.^ to be matrix and vector consisting of 

• rows (^„n,in(j^-))i and , i G A/", respectively, if we have — 

• and having zero rows if {b^^mi-nf^y-^ — A^mi^f^y-j y)i < 0. 



If Tl{y) = max{?/,0}, we now have Jciv) = + pA"''"{y) e K$f , and ^L4^ 
becomes 

which is equivalent to 

(4.6) (A„„ +pA™™(x"))a;"+i = 6„„ + p 6™"(a;"). 

Lemma 4.5. Le< x° G be some starting value, and let (a;")^o ^'^^ sequence 



generated by Algorithm 4-4 have x" < x""*"^ for n > 1. 
Proof. Writing ( |4.6[ ) for x" and x""*"^, we obtain 

+pA™^"(x"))x"+i = 6„„ +p&™"(x") 
and (A„„ + p A™"(x"-i))x" = 6„„ + pfe™"(x"-i), 
and subtracting yields 

(4.7) (A„,+pA'"™(x"))(x"+i-x") 

= p 6"'" (x" ) - p A"" (x" ) x" - p 6™" (x"- 1 ) + p A™™ (x"- 1 ) x" . 

In this, + pA"""(x") € i^T^ has a non- negative inverse, and, therefore, the 



proof is complete if we can show that the right-hand side of expression (4.7) is 
non-negative; to do this, we consider the rows i E JV oi the right-hand side of (4.7) 
separately. 



(i) If (6™™(x"-i)-A™™(x"-i) x"-i)^ = 0, the right-hand side of (|4j]) equals 

(p6™"(x") - p A™™(x") x")^ > 0.' 
(h) If (6™"(x"-i)-A™™(x"-i) x"-i)^ > 0, the right-hand side of (|4j]) equals 

(p 6"'" (x" ) - p A™" (x" ) x" - p 6™" (x"~ 1 ) p vl"*" (x"- 1 ) x" ) ^ 
> (p6™"(x") -pA™"(x")x" -p6™"(.x") +pA™"(x")x")^ = 0. 

This completes the proof. □ 



Lemma 4.6. Let (x")^g be the sequence generated by Algorithm 4-4 There exists 



a constant C > such that, for every starting value x" , ||x"||oo < C, n > 1. 
Proof. We may write ( |4.6[ ) as 

(4.8) x"+i = -I- pA™"(x"))"' + p5™"(x")). 



The boundedness of the right-hand side of K.Sl follows from Corollary 2.6 □ 



Theorem 4.7. Independently of the starting value x*^ , Algorithm 4-4 converges to 



a limit Xp which solves Problem 3.2 



Proof. From Lemmas 4.5 and 4.6 we have the existence of a limit Xp such that 
x" — > Xp as n — >■ oo. Hence, it only remains to show that the limit Xp satisfies 



G(xp) — 0. This follows easily from (4.4), the boundedness of Jq and the continuity 



of the function G. □ 
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5. Penalising the Discrete Obstacle Problem 



Having dealt with Problem 2.2 (termed HJB equation) in the previous sections, we 
now consider Problem 2.3 (termed HJB obstacle problem)^ which has to be treated 
differently due to the combination of "min" and "max" operators. We use the same 



penalty term 11 as introduced in (3.1), and we again make Assumption 3.1 

1^ such that 



Problem 5.1. Let p > 0. Find Zp £ 
(5.1) 



max{A„ Zp ~ 6u} - p^{b - Azp) = 0. 



If one treats the "max" inside the "min" in ( |2.4[ ) as if |U| = 1, then the penalty 
formulation (5.1 1 corresponds to the penalisation technique introduced in |13) . and, 
thus, (5.1 1 has a heuristic interpretation; however, mathematically, the situation is 



a bit more subtle, and we will now study various properties of Problem |5.1 
Lemma 5.2. If there exists a solution to Problem\5.1\ then it is unique. 



Proof. Suppose we have two solutions Zpi and Zp2 . For i £ M , we use u\ , m| G U 
as defined in (2.5 1. From equation ( |5.1[ ), we then get that, for i £ A/", 

(A„i(zpi ~ Zp2))^- piii{(h- Azpi)i) + pTTi{(b - Azp2)i) 
> [A^iZpi - b^i - A^2Zp2 + 6„2)j - pTTi{(b- Azpi)i) + pTri((b - Azp2)i) = 0. 



At this point, we have an expression similar to ( |3.3p , and we can finish off by 
following the lines of the proof of Lemma |3.3| □ 



Lemma 5.3. Assume there exists a solution Zp to Problem 5.1 for every p > 0. 
There exists a constant C > 0, independent of p and II, such that 



II iZp)p>o\\oo < C. 

Proof. Applying Corollary |2.6[ we can argue as in the proof of Lemma |3.4| 



□ 



Lemma 5.4. Assume there exists a solution Zp to Problem 5.1 for every p > 0. 
There exists a constant C > 0, independent of p and II, such that 



(5.2) 



lin < max{A„ Zp — b^}, ma,x{Azp — fe, 0} — 11(6 — Azp) > < C/ p 



Proof. Based on Lemma 5.3 we can directly infer from (5.11 that 

< n(6 - Azp) < C/p. 
At this stage, we obtain estimate (|5.2[) by pointing out that, for i £ Af, either 



max„gu(Ait Zp ~ bu)i = (which also means ni{{b — Azp)i^ = 0) or ma.x{{Az 
b),,0}-n,{{b-Azp),) > ~C/p. 

For the canonical choice n(-) = max{-,0}, (5.2) becomes 



□ 



im \ ma.x{ AuZp- bu},Azp-b\ <C/p, 
y. «eU J oo 



meaning that Zp satisfies Problem 2.3 to an order 0(l/p); this property is consistent 
with the penalty approximation of the HJB equation discussed in Section [3] 



Corollary 5.5. Suppose there exists a solution Zp to Problem 5.1 for every p > 0. 
As p oo, Zp converges to a limit z* which solves Problem 2.3 



Proof. This can be shown by arguing as in the proof of Corollary 3.6 



□ 
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Theorem 5.6. Suppose there exists a solution Zp to Problem 5.1 for every p > 0. 
Let z* he the solution to Problem \2.3\ If, in addition to Assumption \3. 1\ we have 
TTi G C^((0,(X))), i e TV, and inf{^(j/) : y G (0,oo), i G TV} > c,„i„ > 0, then 
there exists a constant C > 0, independent of p and 11 but depending on Cmin , such 
that 

\\z* - 2p||oo < C/p. 



Proof. We proceed similarly to the proof of Theorem |3.7[ Without loss of generality, 
we may assume that Cmin < 1- Let i G J\f. Let {A* ,b*) £ {{A, b)} U {{Au , 6„) : u G 
U} be such that 



min I max(A„ z* - , {Az* -b)i\ ^ {A* z* - b*), ^ 0. 



Similarly, applying Lemma 5.4 let {Aip,bip) G {{A,b)} U : u G U} be 

such that 

\{Aip Zp - hip)i\ = min | max(74„ Zp - , {Azp - h)i \ 

lin I niax(A„ Zp - hu)i , max{(y4zp - b)i , 0} - 7ri((6 - iizp)i)| 



1 

< 



for some constant Ci > independent of p and H. Furthermore, as already in ( |2.5[ ), 
let u* , G U such that {Au' z* — bu*)i — max„gu(^M-z* — and {A^p Zp — 
bup)i — maxueviAu Zp — bu)i ■ We first consider Zp — z* , for which we distinguish 
two cases. (C2 , C3 > are taken to be constants independent of p and E.) 

(i) If A,p = A, we have (^^p(zp - z*))^ < [Azp - 5), - {A* z* - b*), < Cs/p. 

(ii) If A^p = A„P , we have (A„. (zp - 2*)) . < (A„f. Zp - bup)i - {A* z* - b*)i < 

Now, for the reverse case, z* — Zp ^ similar estimates can be obtained by swapping 
the roles of "*" and "p" . The proof can be completed by following the lines of the 



proof of Theorem 3.7 □ 



6. Solving the Penalised HJB Obstacle Problem by Iteration 

In Section [4j we have discussed how to iteratively solve the penalty approximation 
of the discrete HJB equation. Similarly, in this section, we discuss iterative methods 
for the penalty approximation of the discrete HJB obstacle problem. 



Defining 



H{y) := max{A„ y - 6„} - pH(6 - Ay, O) = 



for y G K^, to solve (5.1), we need to compute y such that H{y) = 0. For i G TV, 
y G K^, we define 

(6.1) ^""^(2/) := argmax(A„?/-5^),, 

and we assume that ^T^\(^o,oo) & C^{{0,oo)), G C(M^,U), and 

(6.2) ^[max(A„y-6„)J = e C(M^,M) 
for i, j G TV. The Jacobian of H is then given by 

f/TT ■ — — ~ 

(6.3) (My))^ := (Arn.)). +P^{ib-Ayh){A). 
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for i e J\f, y G M^, whenever ^ is well defined; if ^ does not exist at 0, we set 
^■(0) :— liniy^o ^(z/) — 0- Like in Sectionjij we point out that, for the numerical 
examples following later in this paper, (16. 21) is generally satisfied. 



As already in Theorem |4.1[ whenever 11 is a smooth and non-decreasing function 
we can find a globally convergent iterative scheme for the solution of Problem |5.1 



Theorem 6.1. //, in addition to Assumption 3.1 for i £ J\f, iTi G C"'^(M) and 
> 0, then Newton's method with line search as introduced by J.-S. Pang in |32) 
presents a globally convergent iterative scheme for the solution of Problem \5.1\ In 
particular, this means that there exists a solution to Problem\5.1\ 



Proof. The proof is a minor modification of the proof of Theorem |4.1[ □ 



Knowing that Problem |5.1| has a solution for sufficiently smooth penalty terms 11, 
we can now proceed to showing that there also exists a solution if we penalise using 
the max-function. 



Corollary 6.2. //n(-) ~ max{-,0}, then there exists a solution to Problem 5.1 



Proof. Let (n^)£>o be a sequence of penalty functions satisfying irf G C°°(M), 
< ^ < 1 and ||max{?;,0} - ff(2/)||oo < e for i G TV, y G M, e > 0, and 



let (z*)(:>o be the sequence of solutions to Problem 5.1 corresponding to (n*)£>o 



Based on Lemma 5.3 we know that there exists a subsequence of {z*)^^q , not 
notationally distinguished, that converges to a limit z* G as e 0. We aim to 
show that z* solves Problem |5.1| for n(-) ~ max{-,0}. We have 



11 max{^„ z* — bu} — /omax{6 — Az* , 0}||oo 
= II max{^„ — ba} — p^''(b — Az^) — max{^„ z* — b.^} + pniax{5 — Az* , 0}|| 

u^JJ U^JJ 

< II max{^„ z^ - bu} - max{A„ z* - 6„}||oo 

+ \\pn'{b-Az,) -pmax{6-iz*,0}|loo , 

in which we obtain 

lim II max{A„ z^ - bu} - max{A„ z* - 6„}||oo = 



if we argue as in the proof of Corollary |3.6[ and 

||pn'(6 - Az^) - pmax{6 - iL2:*,0}||oo 

< \\pU'{b-Az*) ~ pma.x{b - Az* ,0}\\^ + WpU" {b - Az*) - pn'{b-A2 

< pe + p\\zt — z* 



OO 



as e — >■ by the properties of (n'^)e>o ; hence, we have max„gu{^u z* — bu} — 
pmax{b- Az*,0} = 0. □ 

Having established that Newton's method with line search can be used to solve the 
penalised HJB obstacle problem, we proceed to Newton's method in its classical 
form. 

Algorithm 6.3. (Newton's Method for the pen. HJB Obstacle Prob.) Let z" G 

be some starting value. Then, for known z", n> 0, find z"^^ such that 

(6.4) JH(z")(z"+i-z") = -ff(z"). 
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Theorem 6.4. Suppose the conditions of Theorem 6.1 are satisfied, and let Zp be 



the solution to Problem 5.1 There exists a neighbourhood B of Zp such that, for 
any starting value z'^ G B, the Newton sequence (2:")„>o generated by Algorithm 
6. 3 is well defined, remains in B and converges to Zp ; the rate of convergence is 
quadratic. 

Proof. The result follows from Theorem 3 in [32] if H is S-differentiable and has a 



strong and non-singular _F-derivative at Zp . Recalling (6.2), we can argue as in the 
proof of Theorem 4.3 to obtain i3-differentiability of H. From Theorem 2 in [32], it 



follows that H has a strong F-derivative, which is non-singular since Jniy) G 
for aU yeR^. □ 

We point out that, based on the assumptions of Theorem|6.1[ the Lipschitz property 



of Jh , additionally required in Theorem 6.4 depends entirely on the regularity of 
y (->■ m.axuev{Ai y - K} in ([6^. 



If we use n(-) = max{-,0} (in which case, based on (6.3), Jh is still well defined) 



(6.4) simplifies, and we get the following special case of Algorithm 6.3 



Algorithm 6.5. (Newton-like Method for the pen. HJB Obstacle Prob.) For i G N 

and y E M^, define G U as in (6.1), and set 



Furthermore, define A'^{y) G M.^^^ and b^{y) G R^ to be matrix and vector 
consisting of 

• rows {A)i and {b)i , i G Af , respectively, j/max{(6 — Ay)i , 0} > 

• and having zero rows i/max{(6 — Ay)i , 0} = 0. 



// n(j/) = max{y,0}, we now have Juiy) = A^maxj-j,-) + pA'^{y) G K"^ , and (6.4) 
becomes 



which is equivalent to 

(6.5) (A„™..(,„) + M+(z"))z"+i = 6„„ 



.(,„)(z")z"-6„™..(,„)) 

-pA+(z")z"-Hp6+(z"), 

p6+(z"). 



From (6.5), it is easy to see that Algorithm 6.5 is well defined. The next theorem 



states that, given certain conditions on the matrix A, local quadratic convergence 
of Algorithm 6.5 to the solution of Problem 5.1 for n(-) = max{-,0} (see also 



Corollary 6.2) can be guaranteed. 



Theorem 6.6. Let (z")5^o be the sequence generated by Algorithm 6.5 There 
exists a constant C > such that, for every starting value z^ , it is ||z"||oo < C , 
n > 1. Furthermore, if A — 1^ , where 1^ S M"'^^^ denotes the identity matrix, 
then there exists a neighbourhood B of Zp such that, for any starting value z^ G B, 
(-z")n>o remains in B and converges to the solution Zp at a quadratic rate. 



of Lemma 



4.6 



Proof. The boundedness of (z")n>i follows from (6.5) by arguing as in the proof 
The quadratic local convergence property of {z^)n>o follows from 
if H is semi-smooth and all V G dH{zp) are non-singular, where 



Theorem 3.2 in 

dH denotes the generalised Jacobian as introduced in [B]. By the assumptions at 
the beginning of this section, H is continuously partially differentiable (and, in 
particular, semi-smooth) everywhere except on the set 2? {y G : 3 i G 
M s.t. yi — {b)i\, since the only critical term is max{6 — Ay, 0} = max{6 — y, 0}. 
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Denoting the directional derivative at y G 
can easily be verified that 

H'{y + h;h)-H'{y; h) 



in direction /i e by H'{y\ h), it 



(6.6) 



lim 

/i->0 



\h\\ 



0, y&V, 



which, by Theorem 2.3 in ^34] , gives semi-smoothness on V, and, since the gener- 
alised Jacobian at b is given by 

dH(b) = Ai : A e [0, 1]} C K"^ , 

we see that all V e dH{b) are in fact non-singular. □ 

Remark 6.7. Since, in practice, Prohlem \5. l\ frequently results from a time- stepping 
routine, the solution from the previous time step can be used as a starting value , 
and, for small enough time-steps, z'^ ^ B as required by Theorems 6.4 and 6.6 can 
be expected to hold. 

7. Evaluating the Continuous Control in Practice 



If we solve Problems 2.2 and 2.3 using penalisation, we have to numerically deal 
with the continuous control, e.g. when computing and in (4.5) 



and (6.1 1 for given x" and z", respectively. Now, if the control is well behaved (as 
in the examples following later in this paper), the exact minimum/maximum can be 
found by differentiating and using analytical techniques; however, this is not neces- 
sarily always possible, i.e. we might have to approximate (A„)„gu and {bu)ueiJ by 
functions that are easier to handle numerically. The following remark states that 
such an approximation is legitimate within the framework of our algorithms. 

Remark 7.1. (Stability in the Control) Suppose we have families of functions 
(A^)Mgu and (fo,^)Mgu , e > 0, satisfying the following properties. 

• For every e > 0, (A^)„gu and (fej^jagu satisfy the same conditions as 
required in the definition of Problem \2.S\ and in the setup of Sections 2] 
and\^ 

• For every u € U, it is limg^o = a'^c^ hnie-s-o b^^ = b^ . 

Suppose we solve Problems 3.2 and 5.1 by using the approximations (A^)„gu and 
(6^)„gU for some e > 0, running any of the algorithms discussed in Sections 
and[^ obtaining solutions x*''^ and z*'*^, respectively. In the limit e — > 0, we have 
x*'-'^ — ?> X* and z*'*^ — >■ z* , where x* and z* , respectively, denote the solutions to 
J^for {Au)uev and (5„)„gu- 



Problems 3.2 and 



Proof. For fixed e > 0, all results from the previous sections hold, and it only 
remains to show that x*'^ — >■ x* and z*'^ — >■ z* as e — >■ 0. Since all involved functions 
are continuous on the compact interval U, the convergence lime_>.o = and 
Wm^^Q 6^ = bu is uniform in u G U; hence, we can reproduce results of Corollary 
|2.6| with constants independent of e > 0, which, following Lemmas |3.4| and |5.3[ 



means that a;*'*^ and z*''^ are bounded independently of e > 0. We then infer the 
existence of converging subsequences, not notationally distinguished, of (a;*'^)e>o 
and (2;*''^)e>o , respectively. Since the convergence in e is uniform in u G U, we may 



swap "hme_>o" and "max^eu" in expressions (3.2) and (5.11, and we see that the 
limits of the sequences {x*'^)t->o and (z*'^)c>o do indeed solve Problems |3.2| and 



5.1 respectively. The uniqueness of the solutions (see Lemmas 3.3 and 5.2 1 means 
that not only subsequences but the whole sequences converge. □ 

We point out that the rather general definition of approximating functions used in 
the above remark includes most common numerical approximations, e.g. piecewise 
constant, piecewise linear and other finite subspace approximations. 
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8. Numerical Results 

Finally, we come to test and analyse the applicability of the previously introduced 
numerical techniques by solving two models from mathematical finance, presented 
in Sections |8.1| and |8.2[ which lead directly to equations as given in Problems |1.1| 



and 1.2 respectively. 



8.1. Example: An Incomplete Market Investment Problem. In this section, 
we solve an incomplete market problem taken from [42] . More precisely, we look 
at an optimal investment model in which an agent has to distribute his money 
between a risk free bond and a risky stock; the market incompleteness arises from 
the stochastic volatility of the stock price process. 

Let 6, a, cr : M — > M be bounded and globally Lipschitz functions, and suppose that 
there exists a constant c, independent of y, such that cj^y) > c. Let T > be some 
finite time horizon. Let /i > r > and Bq , Sq , Yq > 0. Suppose we have a bond 
price process {Bt)o<t<T , a stochastic volatility process {Yt)o<t<T and a stock price 
process {St)o<t<T solving, respectively, 

dBt = rBt dt , 

dVt = b{Yt) dt + a{Yt) dWl 

and dSt = ^JLSt dt + a{Yt) St dW^ , 

where (W7)o<t<T i * G {1j2}, are two Brownian motions defined on a probability 
space (ri, J^, P) with a correlation coefficient g E [—1, 1]. 

Staying exactly in the framework of |31j , we consider an investor who can invest in 
the stock and in the bond. We suppose that the investor has initial wealth Xq > 
and that he may rebalance his portfolio Xt — iTf + Wt at any time t € [0, T] ; here, 
7r° and Wt denote the amounts invested, respectively, in the bond and in the stock. 
The investor's wealth process solves 

dXt = rXt dt+{fi- rj-Kt dt + a{Yt)T:t dWl , t G [0, T], 

and must satisfy Xt>Q for every t e [0,r]. We take the investor's utility function 
to be of CRRA-type and given by 

U{x) = -x'^, X&R, 
7 

for some constant < 7 < 1. Now, trying to maximise the final utility, the investor's 
value function is given by 

(8.1) (t>{x,y,t):=sn^V.[U{XT)\Xt^x,Yt^yl (x, y, e [0, 00) x M x [0, T], 

where A is the set of admissible trading strategies (for details, see |42j). We cite the 
following result, which shows how this utility maximisation problem can be solved. 



Proposition 8.1. The value function introduced in (|8.1|) can he written as 



0(x, y, t) = —^{y, t), {x, y, t) G [0, cx)) x M x [0, T], 
7 



where f solves 

(8.2) ^ [ipt + ^a'^{y)(Pyy + Hyjify] + r^p 



max [^(7 - l)cr2(y)uV + 9'y{y)a{y)u(py + (/i - r)uip] = 0, 
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with ip{y,T) — 1 and U C 
1 satisfies 



an appropriately chosen compact set, and ip{-, •) 



(8.3) (fit + ^a{yYfi>yy + b{y) + q 



1 



- r)a{y) 



(1 - 7)cr(y) 

7(1-7- 



2a2(a)(l-7) 



= 0. 



1-7 

T/ie notion of solution used in this context has to he understood in the viscosity 
sense (cf. |19| j. and cp, (p are the unique solutions to (8.2) and (8.3 1, respectively. 

Proof. The main result can be found in [32] ■ See [3] for details on the existence of 
the viscosity solutions. □ 



In (8.2), we make the assumption of U being a compact set since, for every time 
t € [0,T), the maximum in ( |8.2[ ) is assumed at u = ttj , where (7rj)o<t<T is the 
investor's optimal trading policy (cf. |:42]). which should not reach infinity in a 
meaningful financial model. 



Clearly, from Theorem 
ther equation 



.1 



to find (p, we need to compute ip or (p and solve ei- 
.3[ ) or ( |8.2[ ); we have deliberately chosen a problem which can be 
linearised such that we can obtain a reference solution by standard methods. In 
the next few sections, we will select parameters and present and compare several 
approaches of computing 0. 

8.1.1. Choosing Model Parameters and Functions. We set r = 0.3, /i = 0.7, g = 
—0.2, 7 = 0.5, and T — 1. Furthermore, we introduce ymin ■— n ■— 0.1 and 

Urriax 

:= 1, and, for y G [ymin :V-m.ax\, we use 

a{y) = - 2.5(2; - 0.5 - 0.5k)^ + 2.5(-0.5 + Q.bnf, 
b{y) = -y + 0.55, 
and cr(y) = y. 

Functions a, b and a are shown in Figure [l] they satisfy the technical conditions 
listed in the previous section and guarantee {Yt)t>Q C [k, 1]. 

8.1.2. Discretisation of the Continuous Equations. Numerically, we solve equations 
.2 1 backwards in time, starting at expiry T, on a grid {{ih + K,jk) : 

(1 



(|8.3|) and 

< 



< i < A^, < j < M}, where h := {1 - k)/N and k := T/M. We perform a 
fully implicit finite difference discretisation, using one-sided differences for all first 
derivatives (including the time derivative) and central differences for all second 
derivatives. (For a general overview of basic finite difference concepts for PDEs, 



e.g. see |35j.) In particular, when discretising Txpy (or d(fy), where 5 



denotes the combined coefficient of the first y-derivative in (8.3) (or (8.2)) 



0(2/, u) 
we 



switch between left-sided and right-sided differences in accordance with a positive 
or negative sign of the coefficient 0; this way, we can guarantee the following two 
properties. 

• The tridiagonal discretisation matrix implied by our fully implicit scheme 
has positive entries on the diagonal and non-positive entries on the up- 
per/lower diagonals. 

• Since a{K) — a(l) — and 6(k) > > 6(1), the boundary conditions at k 
and 1 are replaced by finite differences pointing inwards, (cf. [55]). 

Proceeding as just described, the linear parabolic PDF in 
by a simple linear system of equations. Furthermore, taking U to be 
I — 150, our discretisation of equation (8.2 1 matches Problem 2.2 with (|2.l|) and 



3) is approximated 
1,1], with 



(2.2) satisfying all assumptions. 
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An Incomplete Market Problem: Our Choice of a, b,a 




Figure 1. The functions a, h and a as chosen for the volatility pro- 
cess in the investment problem introduced in Section \8.1\ As can easily 
be observed, whenever y approaches k (or 1), a{y) goes to zero and 
b{y) is positive (negative); this guarantees that the diffusion dYt = 
b{Yt) dt + a{Yt) dWt stays in [«:,!]. Additionally, when (numerically) 
solving the two equations ( |8.3[ ) and ( |8.2[ ) on the interval [k, 1], outgoing 
characteristics (cf. [38) ) replace our boundary conditions at k and 1. 



Remark 8.2. Based on our choice of functions a and b (cf. Figure^, the diffu- 
sion (Yt)o<t<T will always stay in [k, 1]. Conceptually, this means that no 'outside' 
information - like Dirichlet boundary conditions - is required for the PDE to be com- 
pletely specified on the interval [n, \], since the flow of information can be thought 
of as coming from 'within'. Mathematically, this means that, by taking a{y) := 0, 
y € M\[k, 1], and using one-sided inwards pointing finite difference stencils at k and 
1, we obtain a consistent (and monotone) discretisation scheme without requiring 
any Dirichlet boundary conditions. 

Remark 8.3. It is generally non-trivial to prove convergence of finite difference 
schemes applied to a (possibly nonlinear) PDE for which only viscosity solutions 
can be shown to exist. The standard reference is [T], where - loosely speaking - it 
is shown that every stable and monotone discretisation converges if the equation 
satisfies a strong comparison principle (cf. |19) |.' other (more specific) approaches 
include |271 l2l[23l l26[ [3l[25j . For the current example, consistency is straightforward 
to prove (e.g. cf. 40] J, stability can be shown following |12j since we have what they 
call a "positive coefficient discretisation" , and a strong comparison result can be 
found in [3] ; hence, altogether, the results of [T] are applicable and convergence to 
the unique viscosity solutions of (8.3) and (|8.2|) can be guaranteed. 



8.1.3. Solution of the Discrete Systems. In this section, we will compare the follow- 
ing three ways of numerically solving the incomplete market problem represented 
by the equations in Theorem |8.1| All computations are done in Matlab. 

• We solve the linear system of equations resulting - for every time step - 
from a discretisation of the linear parabolic PDE in (8.3 1. 
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• We solve Problem 2.2 (corresponding to one time step of a fully implicit 
discretisation of the non-linear parabolic PDE in 



21) by 



the penalty method devised in Sections [3] and |4| of this paper, and by 
— the method of policy iteration. 
The use of policy iteration has been studied in [T^l [30] and is briefly summarised 
in Appendix |Aj Following Section [7| we approximate U = [—1,1] = [—150,150] 
by U := {—150 + : r = 0, 1, . . . , 1000}, thereby discretising U by using a very 
fine grid of 1001 points. (Effectively, in the notation of Remark 7.1, we approxi- 
mate (Atj)tjgu and {bu)ueu by piecewise constant functions.) Numerically, when 
computing candidate solutions to equations (2.3) and (3.2) by policy iteration and 



Algorithm |4.4[ respectively, we terminate the iterations according to the following 
two checks of accuracy, using tol =lc-08. 

l'^ satisfies 



(8.4) 



For (2.3), we terminate if our candidate solution x" e 

1 1 A.. 



< toL 



lloo 



(8.5) 



where u^'^p{x") satisfies A^supf^n^ x^' 
late if our ( 

pmax{6£.^"P(^„ 



For (3.2), we terminate if our candidate solution x^' e 



-bu'^p(x") = mm{Aux'^-bu : u e U}. 

^ satisfies 



I (A 



where Up^P{x") satisfies 6^= 
U} and 



— A,-. 



< tol, 
max{6{,p - Au^ x"' 



Up {x^) 



else. 



Figure [2] shows the impact of the a priori unspecified parameter uq in Problem |3.2| 
on the penalisation error. We first remark that, if uq could be chosen to be the 



optimal control of the discretised problem, the penalisation error in Problem 3.2 
would be identical to zero by construction. Now, clearly, the optimal control is 
unknown, and generally a function of the state variable and time, and, therefore, a 
constant value uq generally gives a non-zero (in fact, negative) penalisation error, 
which we know to converge to zero of first order in 1/p. This is the underlying 
convergence mechanism of the proposed method and does not require a cunning 
choice of uq. Figure [2] does show, however, that the penalisation error for fixed 
p can be reduced by diligent choice of uq. It is thereby sufficient to choose uq of 
the same order of magnitude as the optimal control. One can thus take advantage 
of a priori knowledge of the approximate control size. If such an estimate is not 
available, one might first determine a rough approximation by producing a crude 
version of Figure [2] on a coarse mesh - coarse in parameter space as well as time and 
state space - which is computationally cheap, and pick ug accordingly. We do not 
take advantage of this information in the following computations, and, throughout, 
we use Uq = —I = —150, which appears to be the worst-case choice. We also 
tested the impact of uq on the Newton method, and found the required number of 
iterations virtually unaffected in all settings. 



We use the numerical solution of the linear parabolic PDE (8.3) as a reference 
solution for the incomplete market problem. For M = N ~ 200 and p — le06, 
measured in the maximum norm, the difference between the numerical solution of 



3) and the penalty approximation (3.2) is 2e-03, and the difference between the 



penalty approximation (3.2) and the policy iteration solution of (2.3) is 2e-04. 
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Penalty Scheme: Error in Dependence of 

.x10" 




8.1 



Figure 2. The incomplete market investment problem of Section 
For M = N = 200 and different choices of uq , we see the difference 
between the solution computed by the penalty scheme and the policy it- 
eration method. The error is measured in the max-norm. We observe 
that, for the quality of the penalty approximation, it is advantageous to 
pick Uq close to the optimal control value. 



The results of our numerical tests are summarised in Figures |3] and |4] and in Tables 
ID and m 

In Figure [3j we see the penalty approximation for p — le03. Given that the 
penalty parameter is still relatively small, the penalty approximation is still below 
the reference solution, but, clearly, the curves are similarly shaped already. In 
Figure [4j we measure rate of the convergence in p; more precisely, for different sizes 
of p, starting out with the t = k value of the reference solution (i.e. the value at 
the penultimate time step) , we compute a time zero value using the penalty scheme 
(thus solving exactly one discrete LCP) and compare it to the time zero value of 
the reference solution. As expected, based on Theorem |3.7[ the convergence in p is 
of first order. In Table [TJ we see the number of iterations needed by Algorithm |4.4| 
and policy iteration, averaged over all time steps; for the two schemes, the numbers 
are almost identical, and - in both cases - we never need more than two iterations 
to reach the desired accuracy tol. Finally, in Table [2] we see the computation times 
for the two schemes, which, as is to be expected based on the iteration numbers, 
are virtually the same. 

8.2. Example: Early Exercise Options in an Incomplete Market. In this 
section, we value an early exercise contract in an incomplete market, in which 
the incompleteness stems from the fact that the asset on which the early exercise 
contracts are written is not traded; the example in its entirety is taken from j31| . 
and we use it to demonstrate that the arising non-linear equations can be solved 
by a fully implicit finite difference scheme when using the results of Section [5] 

Let 6, a : M — > M be bounded and globally Lipschitz functions. Let T > be 
some finite time horizon. Let /i, cr > and 5*0 > 0, lo G IR. Suppose we have a 
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An Incomplete Market Problem 




0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 

Stochastic Factor y 



Figure 3. The incomplete market investment problem of Section 8.1 
For M = N — 200, we see the solution to (8.31, referred to as "reference 
solution" stnce it avoids the treatment of non-linearities, and the solution 
to the penalised equation (3.2 1 for p = le03; for the current choice of p, 
the penalty approximation is still coarse. 



Policy Iteration 


n=l 


n = 2 


M, N = 5Q 


6% 


94% 


M,N = 200 


11% 


89% 


M = 200, TV 50 


53% 


47% 


M = 50, TV = 200 




100% 


Penalty Method {p = 4e03) 


n=l 


n = 2 


M, TV = 50 


8% 


92% 


TW", TV = 200 


13% 


87% 


M = 200, N = bQ 


49.5% 


50.5% 


M = 50, N ^ 200 




100% 


Penalty Method {p = le06) 


n=l 


71 = 2 


TVf , TV = 50 


6% 


94% 


M, N ^ 200 


11% 


89% 


M = 200, TV = 50 


55% 


45% 


M = 50, TV = 200 




100% 



Table 1. The incomplete market investment problem of Section 
For different time and space grids, we see the number of iterations needed 
by penalty approximation - or, more precisely, by Algorithm \4.4\ when 
solving (13. 21) - and policy iteration. For the different schemes, the num- 



bers are very similar, and, generally, we never need more than two steps. 



traded asset price process {St)o<t<T and a non-traded asset price process (Yt)o<t<T 
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Figure 4. The incomplete market investment problem of Section 8.1 
For M—N=200, we measure the speed of convergence in the penalty pa- 
rameter p. The error is measured m the max-norm. The plot is log-log, 
and we observe a convergence rate of 0.992, close to one, confirming the 
results of Theorem 



3.1 



Grid Size 


Policy 


Penalty {p = 4e03) 


Penalty (p = le06) 


M, N ^50 


1.08s 


1.08s 


1.09s 


M, N = 200 


35.07s 


34.76s 


35.16s 


M = 200, N = 50 


2.74s 


2.82s 


2.76s 


M = 50, N = 200 


10.89s 


10.88s 


10.91s 



Table 2. The incomplete market investment problem of Section 8.1 
For different time and space grids, the computation times needed by 
penalty approximation and policy iteration. In all cases, the compu- 
tational effort of the two schemes is very similar. 



solving, respectively, 

dSt = piSt dt + a St dWl 
and dYt = h{Yt) dt + a{Yt) dW^ , 

where (W^()o<i<T i * G {li2}, are two Brownian motions defined on a probability 
space (fi, J^, P) with a correlation coefficient g E [—1, 1]. Additionally, we assume 
the existence of a riskless bond with interest rate r = 0. 

Similar to the previous example, we consider an investor who can invest in the 
traded asset and in the bond. We suppose that the investor has initial wealth 
Xq G M and that he may rebalance his portfolio Xt = 7r° + 7rt at any time t G [0, T]; 
here, 7r° and ttj denote the amounts invested, respectively, in the bond and in the 
stock. The investor's wealth process solves 



dXt = pnt dt + aiTt dW} , i G [0, T] . 
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We take the investor's utility function to be of exponential type and given by 



U{x) 



with risk aversion parameter 7 > 0. Now, we suppose the investor holds an early 
exercise contract with payoff P{y), y G M., on the non-traded asset, and we would 
like to find the indifference price (cf. [31 ) of the instrument; we cite the following 
result. 

Proposition 8.4. The buyer's early exercise indifference price ^{y, t), where (y, t) G 
M X [0,T], is the unique bounded viscosity solution to 

(8.6) min { - i/^* - + - g^)a^iy)i^l ,^ - Piy)} 

where ■0(-,r) ~ P{-) and 



0, 



:= ^a'^{y)i'yy + {b{y) ~ Q-^a{y))i'v ■ 



Proof. The main result can be found in [31]. See [3] for details on the existence of 
the viscosity solution. □ 

It can easily be shown that ijjl = max„gR{2u'0i/ — u^}, and, if we assume ijjy to be 



bounded, (8.6) can be rewritten as 

(8.7) min { max{£^ V : w e U}, V' - P{,y) } = 0, 

where U C M is a suitably chosen compact set and, for m e U, we define 



4V := -A - 7^a2(y)V: 



Hence, to compute the early exercise indifference price of the considered option, we 



have to solve (8.7 1, which has the same structure as \1.2 



8.2.1. Choosing Model Parameters and Functions. We set fi/a = 1, g = 0.1, 7 = 1 
and T = 1. Furthermore, we introduce ymin ■— and ymax ■— 5, and, for y G 



[yn 



, ymax] , we use 



aiy) = y, 

b{y) = 0.3y 
and P{y) — max{l — y, 0}. 

In particular, the choice of P{-) means that we are dealing with an American put 
with strike one. 

8.2.2. Discretisation of the Continuous Equations and Solution of the Discrete Sys- 
tems. Numerically, we solve (8.6) and (8.7) similarly to Section 8.1.2 i.e. we pro- 



ceed backwards in time, starting at expiry T, on a grid {{ih,jk) : < i < < 
j < M}, where h := 5/N and k :— T/M. In space, we again apply a finite differ- 
ence discretisation guaranteeing for the discretisation matrices to be in K'^ . Since 
we are dealing with a put option, we use ip{0) — 1 and '0(5) = as boundary 
conditions, and we take the set U in (8.7) to be [—1,0]. We compare the following 
three numerical approaches in Matlab. 

• For (8.61, we use an explicit time stepping scheme, meaning all non-linearities 



can be dealt with easily. 



• We solve Problem 2.3 (corresponding to one time step of a fully implicit 
discretisation of the non-linear parabolic PDE in 

of this paper, and by 

or isni)- 



the penalty method devised in Sections [s] andl6 
the method of policy iteration (see Appendix \B 
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As already in Section 8.1.3 we employ Remark 7.1 and approximate U — [—1,0] 



- 1 



-1 + 



ini 



0, 1, . . . 101}. When solving the penalised equation (5.1 ) 



by Algorithm 6.5 we use a test for accuracy of the kind (8.5), setting tol =le-08 



as before. Similarly, we use a test for accuracy of the kind (8.4 1 with the same 



tolerance when solving an equation of the form (2.4) by policy iteration. 



Early Exercise in an Incomplete Market 




1 2 3 4 5 ' 0.1 0.2 0.3 0.4 0.5 

Non-Traded Asset y Non-Traded Asset y 



Figure 5. The incomplete market early exercise pricing problem of Sec- 
tion 8.2 We see the penalty approximation (for M = N — 200 and 
p = 1) and the solution of the explicit scheme (for M = 4e04 and 
N = 200^. Even though the penalty parameter is still very small, the 
penalty solution already seems to be a reasonable approximation. 



Remark 8.5. As already in Remark\8.3[ convergence of the fully implicit discreti- 



sation of (8.7) to the unique viscosity solution can be guaranteed by noting that 



we have stability, monotonicity, and consistency of the discretisation, and by us- 



ing a strong comparison principle (cf. [3]/ The fully explicit discretisation of (8.6) 
converges similarly provided we have stability. 

In our numerical tests, we find the explicit scheme to require a relatively large num- 
ber of time steps for stability, making it difficult to use. In Table[3j fixing p =le06, 
we see the difference between the explicit scheme and the penalty approximation for 
different grid sizes; for the explicit scheme, for a given space discretisation, we al- 
ways choose the number of time steps such that the solution plot does not show any 
instabilities, whereas for the penalty scheme we take identical numbers of time and 
space steps. For = 50 and A^ = 200, the explicit and the penalty scheme differ by 
1.5e-03 and 3.6e-04 in the max-norm, respectively (cf. Table|3|. We point out that 
the explicit scheme runs substantially longer due to the high number of time steps 
required for stability; picking the time and space steps proportional to each other 
for the fully implicit scheme is optimal experimentally because of the observed first 
order convergence in both time and space. In Figure [5] we see the penalty approx- 
imation for p = 1 and the explicit solution; even though the penalty parameter is 
very small, the two graphs are extremely close. The difference in the max-norm 
between the policy iteration and penalty approximation solutions is 1.6165e-05 and 
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Explicit Scheme 


Time 


Penalty (p = le06) 


Time 


Difference 


M = 2500, iV = 50 


0.78s 


M = 50, N = 50 


0.23s 


1.3e-03 


M = 4e04, N = 200 


16.49s 


M = 200, N = 200 


3.71s 


3.5e-04 



Table 3. The incomplete market early exercise pricing problem of Sec- 
tion \8.S\ For different time and space grids, we see the difference be- 
tween the explicit scheme and the penalty method, and the respective 
runtimes. The explicit scheme runs much longer due to the high number 
of time steps needed to guarantee stability for a given space discreti- 
sation. Furthermore, since our space discretisation contains one-sided 
differences for reasons of monotonicity, the expected consistency order 
is 0{1/M) + 0{1/N), which also makes the choice M = N desirable. 



Policy Iteration 


Max Iterations 


Iterations 


Runtime 


M, N = 50 


4 


2.20 


0.38s 


M,N = 200 


11 


2.17 


6.42s 


M = 200, TV = 50 


4 


1.83 


0.86s 


M = 50, iV = 200 


18 


2.88 


2.12s 


Penalty Method {p = 4e03) 


Max Iterations 


Iterations 


Runtime 


M, N = 50 


3 


1.98 


0.25s 


M, N = 200 


3 


1.21 


4.09s 


M = 200, TV = 50 


3 


1.15 


0.66s 


M = 50, N = 200 


4 


2.16 


1.47s 


Penalty Method {p = le06) 


Max Iterations 


Iterations 


Runtime 


M, N = 50 


2 


1.10 


0.17s 


M, N = 200 


3 


1.08 


3.82s 


M = 200, TV = 50 


2 


1.02 


0.61s 


TW = 50, TV = 200 


4 


1.38 


1.13s 



Table 4. The incomplete market early exercise pricing problem of Sec- 
tion \8.2\ For different time and space grids, we see the number of itera- 
tions needed by Algorithm\6.^when solving the penalised eguation (|5.1[). 



Independently of the grid size, the absolute and the average number of 
reguired iterations is small. 



2.601 le-05 for grid sizes Af = TV = 50 and TW" = TV = 200, respectively. In Table 
|4j for different grid sizes and penalty parameters, we see the maximum and the 
average number of iterations needed by Algorithms |6.5| (penalty approximation) 
and |B.1| (policy iteration) for solving the discrete systems at every time step, as 



well as the corresponding runtimes for the full schemes. (In case of Algorithm B.l 



we count one iteration whenever line (B.2) is executed.) Throughout, the average 
numbers of iterations are small, and both schemes run very fast, with the penalty 
scheme being faster by about a factor two; the effect appears to be due to the fact 
that - whilst the average number of iterations is small - policy iteration requires 
a large number of iterations in a few instances (as seen by the Max Iterations in 
Table |4]); we will investigate this effect more closely below. Finally, in Figure |4] 
we measure the rate of convergence in p, and confirm first order convergence as 
predicted by Theorem |5.6[ the implementation is precisely as in Section [SA] except 
that we use a p = le08 penalty approximation as reference solution; as before, the 
error is measured in the wax-norm. 



8.2.3. Sensitivity with Respect to the Initial Guess. We have seen above that, unlike 
penalty approximation, policy iteration requires many iterations in a few instances 
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Convergence in the Penalty Parameter 




1000 2000 3000 4000 50006000 

P 



Figure 6. The incomplete market early exercise pricing problem of Sec- 
tion \8.S\ For M—N—200, we measure the speed of convergence in the 
penalty parameter p. The error is measured m the max-norm. The 
plot is log-log, and we observe a convergence rate of 0.910, close to one, 
confirming the results of Theorem 



5.6 



and that this effect appears to correlate with the grid size N (cf. Table |4]). To 
investigate if the phenomenon relates to the quality of the initial guess for the non- 
linear iterations, we set M = 1 and consider different grids sizes N; the results can 
be seen in Figure [7j (Since, in our implementations, we use the solution from the 
previous time step as initial guess for the next, setting M = 1 can be interpreted 
as solving a single non-linear discrete system with a poor initial guess.) Clearly, 
the number of Newton iterations for the penalised system is almost unaffected by 
the increase of N , whereas the number of iterations of the policy iterations grows 
linearly in N. (In ■''>() . it has already been observed that even for a simple American 
option problem, the best obtainable bound on the number of iterations of policy 
iteration is linear, i.e. 0{N).) 



Now, it is easy to see that equation (B.l) is scalable, i.e., for any (5 > 0, it can 
equivalently be rewritten as 

min <^ max{Au z — 6„}, S{Az — b) > — 0, 

and it has been pointed out in [TSl [T71 [T5] that a different choice of 6 will generally 
lead to different policy iterations; more precisely, in our context, depending on the 



choice of S, we obtain different adaptations of Algorithm B.l all converging to the 
same solution. For theoretical considerations on the best choice of 6, we refer to 
|17j : numerically, we find the following. 



• Simply introducing a scaling factor 6 does not yield an improvement. 
Changing the 
improvement. 
Using a scalin 

icantly reduces the number of iterations needed (cf. Figure [?]). 



• Changing the initial guess from the payoff P{-) to z*^ = 1 does not yield an 

• Using a scaling factor 6 — le06, combined with initial guess z° = 1, signif- 
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Further analysis shows that a large scaling factor 6 yields an improvement whenever 
the initial guess is such that Az^ — b > 0, which is necessary for the multiplication 
by the scaling factor to have an effect. 

In summary, we can conclude that the penalty approximation appears to have a 
generic advantage when dealing with poor starting values, whereas - to obtain 
equally good results by policy iteration - prudent implementation is inevitable. 
The main reason for the different performance of policy iteration seems to be that 
it does not show Newton-type behaviour, i.e. it does not converge to the solution 
in steps with rapidly decreasing size; in particular, when using the payoff as initial 
guess, it shifts the solution upwards node by node until the free boundary is found, 
resulting in the linear dependence on N observed in Figure [7] 




Figure 7. The incomplete market early exercise pricing problem of Sec- 
tion \8.S\ For M — 1 and varying N, we measure the number of non- 
linear iterations required by (scaled) policy iteration and penalty approx- 
imation, respectively. 



9. Conclusion 

In this paper, we consider the numerical solution of continuously controlled HJB 
equations and HJB obstacle problems - which trivially includes finitely controlled 
equations - and we show that penalisation is a powerful means for solving the 
non-linear discrete problems resulting from implicit finite difference discretisations. 
Generally, this can be done by policy iteration or - as we show - by penalisation 
combined with a Newton-type iteration. For both penalty approaches, we show 
that the achieved accuracy is 0(l/p), where p is the penalty parameter. We include 
numerical examples from (early exercise) incomplete market pricing, demonstrating 
the competitiveness of our algorithms as fast and easy-to-use numerical schemes. 
An interesting open problem is the extension of our approach to more general Isaacs 
equations. 
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Appendix A. Policy Iteration for the HJB Equation 

We briefly recap the policy iteration algorithm for HJB equations as introduced in 
[T^ and [3D]. Recall that we want to solve Problem 2.2 i.e. we are trying to find 
X e such that 

(A.l) mm{Au x - : u e V} = 0. 

Algorithm A.l. (Policy Iteration for HJB Eq.) For i E JV and y £ M.^ , define 
^^''"(y) e U to 6e such that 

ur^'iy) = argmin{(A„2/ - : w e U}, 

and set A™^{y) G E^^^ and b™^{y) e to be matrix and vector consisting of 
rows {A^min(^y-^)i and , i E JV, respectively. Let x'^ G be some starting 

value. Then, for known x" , n > 0, find x"^^ such that 

(A.2) A™*"(a;") x"+^ = 



Theorem A.2. Let {x")n>o be the sequence generated by Algorithm A.l We have 
rj.n+1 > j-g^ n> 1. As n ^ oo, x" converges to a limit z* which solves (A.l). 

Proof See [12] or [30]. □ 

Appendix B. Policy Iteration for the HJB Obstacle Problem 
We briefly recap the method of policy iteration algorithm for HJB obstacle problems 



that was introduced in [30] ■ Recall that we want to solve Problem 2.3 
trying to find z S M.^ such that 



I.e. we are 



(B.l) min <^ max{Au z — Az — b> 



0. 



Now, for u G U, we define A^fi := A^ and Au.i := A, and we define bu,Q and bu,i 
correspondingly. Using these new definitions, (B.l I is equivalent to 



min I ma.x{Au.v z - bu^y} } 



= 



Algorithm B.l. (Policy Iteration for HJB Obstacle Prob.) Fori € Af and z € M^, 

define ?;™"(z) e {0, 1} such that 

max(A«,i,"'"(z) z - fo„,i,"""(^))i = niin I max(A„,„ z - bu,v)t\- 

Let G be some starting value. Then, for known z^ , n > 0, find z"+^ such 
that 

(B.2) max(A„.,™(,„)z"+i-6„,„™,„(,)), = 0, leAf. 



Theorem B.2. Let (z"')n>o the sequence generated by Algorithm B.l We have 
^.n+i > n > 1. As n — >■ 00, z" converges to a limit z* which solves (B.l). 

Proof See [30]. □ 
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